Opening the Black Box of wav2vec Feature Encoder
Self-supervised models, namely, wav2vec and its variants, have shown
promising results in various downstream tasks in the speech domain. However,
their inner workings are poorly understood, calling for in-depth analyses on
what the model learns. In this paper, we concentrate on the convolutional
feature encoder, whose latent space is often speculated to represent
discrete acoustic units. To analyze the embedding space in a reductive manner,
we feed the model synthesized audio signals, each a summation of simple sine
waves. Through extensive experiments, we conclude that various information is
embedded inside the feature encoder representations: (1) fundamental frequency,
(2) formants, and (3) amplitude, packed with (4) sufficient temporal detail.
Further, the information incorporated inside the latent representations is
analogous to spectrograms but with a fundamental difference: latent
representations construct a metric space so that closer representations imply
acoustic similarity.
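The probing signals described above, summations of simple sine waves, can be sketched as follows. This is a minimal illustration, not the authors' setup: the function name `synthesize`, the 16 kHz sample rate, and the chosen frequencies and amplitudes are all assumptions made for the example.

```python
import math

def synthesize(freqs_hz, amps, duration_s=0.01, sample_rate=16000):
    """Return samples of a signal that is the sum of simple sine waves.

    freqs_hz and amps are parallel lists (hypothetical parameters);
    sample_rate defaults to 16 kHz as an assumption for this sketch.
    """
    if len(freqs_hz) != len(amps):
        raise ValueError("freqs_hz and amps must have the same length")
    n_samples = int(round(duration_s * sample_rate))
    return [
        sum(a * math.sin(2 * math.pi * f * t / sample_rate)
            for f, a in zip(freqs_hz, amps))
        for t in range(n_samples)
    ]

# A 100 Hz fundamental with two weaker harmonics (illustrative values only).
signal = synthesize([100.0, 200.0, 300.0], [1.0, 0.5, 0.25])
```

Varying one component at a time (the lowest frequency, the relative strength of higher components, or the overall amplitude) gives a controlled way to see which property a representation responds to.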
Parameterized Complexity Results for General Factors in Bipartite Graphs with an Application to Constraint Programming
The NP-hard general factor problem asks, given a graph and for each vertex a
list of integers, whether the graph has a spanning subgraph where each vertex
has a degree that belongs to its assigned list. The problem remains NP-hard
even if the given graph is bipartite with partition U+V, and each vertex in U
is assigned the list {1}; this subproblem appears in the context of constraint
programming as the consistency problem for the extended global cardinality
constraint. We show that this subproblem is fixed-parameter tractable when
parameterized by the size of the second partite set V. More generally, we show
that the general factor problem for bipartite graphs, parameterized by |V|, is
fixed-parameter tractable as long as all vertices in U are assigned lists of
length 1, but becomes W[1]-hard if vertices in U are assigned lists of length
at most 2. We establish fixed-parameter tractability by reducing the problem
instance to a bounded number of acyclic instances, each of which can be solved
in polynomial time by dynamic programming.
Comment: Full version of a paper that appeared in preliminary form in the
proceedings of IPEC'1
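To make the problem statement concrete, here is a brute-force checker for the general factor problem on a tiny instance. It only illustrates the definition: it enumerates every edge subset, so it is exponential and is not the fixed-parameter algorithm from the paper. The function name and the example instance are hypothetical.

```python
from itertools import combinations

def has_general_factor(vertices, edges, allowed):
    """Return True if some edge subset gives every vertex an allowed degree.

    allowed maps each vertex to its list (here a set) of permitted degrees.
    Brute force over all edge subsets, for illustration only.
    """
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            degree = {v: 0 for v in vertices}
            for u, v in subset:
                degree[u] += 1
                degree[v] += 1
            if all(degree[v] in allowed[v] for v in vertices):
                return True
    return False

# Bipartite instance with U = {u1, u2}, V = {v1}; every vertex in U has list {1}.
vertices = ["u1", "u2", "v1"]
edges = [("u1", "v1"), ("u2", "v1")]
feasible = has_general_factor(vertices, edges, {"u1": {1}, "u2": {1}, "v1": {2}})
infeasible = has_general_factor(vertices, edges, {"u1": {1}, "u2": {1}, "v1": {1}})
```

In the second call both vertices of U must use their single edge, which forces degree 2 at v1, so no factor exists when v1 only allows degree 1.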
Speech Intelligibility Assessment of Dysarthric Speech by using Goodness of Pronunciation with Uncertainty Quantification
This paper proposes an improved Goodness of Pronunciation (GoP) that utilizes
Uncertainty Quantification (UQ) for automatic speech intelligibility assessment
for dysarthric speech. Current GoP methods rely heavily on neural
network-driven overconfident predictions, which is unsuitable for assessing
dysarthric speech due to its significant acoustic differences from healthy
speech. To alleviate the problem, UQ techniques were used on GoP by 1)
normalizing the phoneme prediction (entropy, margin, maxlogit, logit-margin)
and 2) modifying the scoring function (scaling, prior normalization). As a
result, prior-normalized maxlogit GoP achieves the best performance, with a
relative increase of 5.66%, 3.91%, and 23.65% compared to the baseline GoP for
English, Korean, and Tamil, respectively. Furthermore, phoneme analysis is
conducted to identify which phoneme scores significantly correlate with
intelligibility scores in each language.
Comment: Accepted to Interspeech 202
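The four quantities named in the abstract (entropy, margin, maxlogit, logit-margin) are commonly computed from a classifier's output vector as sketched below. These are generic textbook definitions over one frame's logits, given as an assumption; the abstract does not state the exact formulas, so this should not be read as the paper's GoP scoring function. All names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def uncertainty_measures(logits):
    """Generic uncertainty measures for one vector of phoneme logits.

    Assumed definitions (not taken from the paper):
      entropy      - Shannon entropy of the softmax distribution
      margin       - gap between the two largest probabilities
      maxlogit     - largest raw logit
      logit_margin - gap between the two largest raw logits
    """
    probs = softmax(logits)
    top_probs = sorted(probs, reverse=True)
    top_logits = sorted(logits, reverse=True)
    return {
        "entropy": -sum(p * math.log(p) for p in probs if p > 0),
        "margin": top_probs[0] - top_probs[1],
        "maxlogit": top_logits[0],
        "logit_margin": top_logits[0] - top_logits[1],
    }

uniform = uncertainty_measures([0.0, 0.0, 0.0, 0.0])  # maximally uncertain
peaked = uncertainty_measures([5.0, 1.0, 0.0])        # one clear winner
```

A uniform output has maximal entropy and zero margin, while a peaked output has low entropy and a large margin, which is the contrast such measures are meant to expose.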
Automatic Severity Assessment of Dysarthric speech by using Self-supervised Model with Multi-task Learning
Automatic assessment of dysarthric speech is essential for sustained
treatments and rehabilitation. However, obtaining atypical speech is
challenging, often leading to data scarcity issues. To tackle the problem, we
propose a novel automatic severity assessment method for dysarthric speech,
using the self-supervised model in conjunction with multi-task learning.
Wav2vec 2.0 XLS-R is jointly trained for two different tasks: severity level
classification and auxiliary automatic speech recognition (ASR). For the
baseline experiments, we employ hand-crafted features such as eGeMAPS and
linguistic features, and SVM, MLP, and XGBoost classifiers. Evaluated on the
Korean dysarthric speech QoLT database, our model outperforms the traditional
baseline methods, with a relative increase of 4.79% in
classification accuracy. In addition, the proposed model surpasses the model
trained without the ASR head, achieving a 10.09% relative improvement.
Furthermore, we present how multi-task learning affects the severity
classification performance by analyzing the latent representations and
regularization effect.
Impact of Sexual Attitude and Marital Intimacy on Sexual Satisfaction in Pregnant Couples: An Application of the Actor-Partner Interdependence Model
PURPOSE: The purpose of this study was to investigate actor and partner effects of sexual attitude and marital intimacy on sexual satisfaction in pregnant couples.
METHODS: Data were collected from 176 pregnant couples visiting hospitals for prenatal care from June 18 to September 24, 2016. The data were analyzed using paired t-tests and Pearson's correlation coefficients in SPSS 18.0, and interdependent effects were examined with Actor-Partner Interdependence Model analysis in AMOS 18.0.
RESULTS: Neither the sexual attitude (β=.12, p=.141) nor the marital intimacy (β=.01, p=.938) of the pregnant woman had a partner effect on the sexual satisfaction of her husband. The sexual attitude of the husband had a partner effect on the sexual satisfaction of the pregnant woman (β=.13, p=.021), but the marital intimacy of the husband did not (β=.07, p=.202).
CONCLUSION: This study suggests that the sexual attitude and marital intimacy of pregnant couples should be considered when developing interventions to improve couples' sexual satisfaction. Moreover, pregnant couples should participate in such interventions together, because sexual satisfaction is interdependent within a two-person relationship.
The change of QRS duration after pulmonary valve replacement in patients with repaired tetralogy of Fallot and pulmonary regurgitation
Purpose: This study aimed to analyze changes in QRS duration and cardiothoracic ratio (CTR) following pulmonary valve replacement (PVR) in patients with tetralogy of Fallot (TOF).
Methods: Children and adolescents who had previously undergone total repair for TOF (n=67; median age, 16 years) and who required elective PVR for pulmonary regurgitation and/or right ventricular outflow tract obstruction were included in this study. The QRS duration and CTR were measured pre- and postoperatively, and postoperative changes were evaluated.
Results: Following PVR, the CTR significantly decreased (pre-PVR 57.2%±6.2%, post-PVR 53.8%±5.5%, P=0.002). The postoperative QRS duration showed a tendency to decrease (pre-PVR 162.7±26.4 msec, post-PVR 156.4±24.4 msec, P=0.124). QRS duration was greater than 180 msec in 6 patients prior to PVR. Of these, 5 patients showed a decrease in QRS duration following PVR; QRS duration fell below 180 msec in 2 patients and remained greater than 180 msec in 3 patients, including 2 patients with diffuse postoperative right ventricular outflow tract hypokinesis. Six patients had coexisting arrhythmias before PVR: atrial tachycardia in 2 patients, premature ventricular contraction in 3 patients, and premature atrial contraction in 1 patient. None of the patients presented with arrhythmia following PVR.
Conclusion: The CTR and QRS duration decreased following PVR. However, QRS duration may not decrease below 180 msec after PVR, particularly in patients with right ventricular outflow tract hypokinesis. The CTR and ECG may provide additional clinical information on changes in right ventricular volume and/or pressure in these patients.
Algorithm for Finding k-Vertex Out-trees and its Application to k-Internal Out-branching Problem
An out-tree T is an oriented tree with only one vertex of in-degree zero. A
vertex of T is internal if its out-degree is positive. We design
randomized and deterministic algorithms for deciding whether an input digraph
contains a given out-tree with k vertices. The algorithms are of runtime
[formula missing] and [formula missing], respectively. We apply the
deterministic algorithm to obtain a deterministic algorithm of runtime
[formula missing], where [symbol missing] is a constant, for deciding whether an input digraph
contains a spanning out-tree with at least k internal vertices. This answers
in affirmative a question of Gutin, Razgon and Kim (Proc. AAIM'08).
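The two definitions in this abstract (out-tree, internal vertex) can be checked directly on a small digraph, as in the sketch below. It only illustrates the definitions and is not the paper's algorithm for finding out-trees; the function names and the example arcs are hypothetical.

```python
def is_out_tree(vertices, arcs):
    """Check that (vertices, arcs) is an oriented tree with one in-degree-zero vertex.

    Uses the equivalent test: exactly one root, |arcs| = |vertices| - 1,
    and every vertex reachable from the root along arc directions.
    """
    in_degree = {v: 0 for v in vertices}
    children = {v: [] for v in vertices}
    for tail, head in arcs:
        in_degree[head] += 1
        children[tail].append(head)
    roots = [v for v in vertices if in_degree[v] == 0]
    if len(roots) != 1 or len(arcs) != len(vertices) - 1:
        return False
    seen, stack = {roots[0]}, [roots[0]]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return len(seen) == len(vertices)

def internal_vertices(vertices, arcs):
    """Return the set of vertices with positive out-degree."""
    tails = {tail for tail, _ in arcs}
    return {v for v in vertices if v in tails}

tree_vertices = ["r", "a", "b", "c"]
tree_arcs = [("r", "a"), ("r", "b"), ("a", "c")]
tree_ok = is_out_tree(tree_vertices, tree_arcs)
tree_internal = internal_vertices(tree_vertices, tree_arcs)
```

In this example the root r and the vertex a are internal, while b and c are leaves, so the out-tree has two internal vertices.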